BabelDomains: Large-Scale Domain Labeling of Lexical Resources
Abstract
In this paper we present BabelDomains, a unified resource which provides lexical items with information about domains of knowledge. We propose an automatic method that uses knowledge from various lexical resources, exploiting both distributional and graph-based clues, to accurately propagate domain information. We evaluate our methodology intrinsically on two lexical resources (WordNet and BabelNet), achieving a precision over 80% in both cases. Finally, we show the potential of BabelDomains in a supervised learning setting, clustering training data by domain for hypernym discovery.
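As a rough illustration of the last point, clustering training data by domain can be sketched as grouping (hyponym, hypernym) pairs under the domain label of the hyponym. This is only a minimal sketch under stated assumptions: the function name `group_pairs_by_domain`, the `domain_of` mapping, and the toy labels are hypothetical and are not taken from BabelDomains or from the paper's actual pipeline.

```python
from collections import defaultdict


def group_pairs_by_domain(pairs, domain_of, default="unlabeled"):
    """Group (hyponym, hypernym) training pairs by the hyponym's domain label.

    `domain_of` is a hypothetical mapping from a lexical item to one domain
    label; items missing from the mapping fall into the `default` group.
    """
    clusters = defaultdict(list)
    for hyponym, hypernym in pairs:
        clusters[domain_of.get(hyponym, default)].append((hyponym, hypernym))
    return dict(clusters)


# Toy, made-up domain labels (assumption: not real BabelDomains output).
domain_of = {"piano": "music", "violin": "music", "aspirin": "health"}
pairs = [
    ("piano", "instrument"),
    ("aspirin", "drug"),
    ("violin", "instrument"),
    ("table", "furniture"),
]

clusters = group_pairs_by_domain(pairs, domain_of)
for domain, items in clusters.items():
    print(domain, items)
```

Each resulting group could then be used to train or select a domain-specific model; how the groups are actually consumed for hypernym discovery is not specified in the abstract.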
Similar resources
Knowledge-based Supervision for Domain-adaptive Semantic Role Labeling
Semantic role labeling (SRL) is a method for the semantic analysis of texts that adds a level of semantic abstraction on top of syntactic analysis, for instance adding semantic role labels like Agent on top of syntactic functions like Subject. SRL has been shown to benefit various natural language processing applications such as question answering, information extraction, and summarization. Au...
Leveraging Reusability: Cost-Effective Lexical Acquisition for Large-Scale Ontology Translation
Thesauri and ontologies provide important value in facilitating access to digital archives by representing underlying principles of organization. Translation of such resources into multiple languages is an important component for providing multilingual access. However, the specificity of vocabulary terms in most ontologies precludes fully-automated machine translation using general-domain lexic...
Combining Multiple, Large-Scale Resources in a Reusable Lexicon for Natural Language Generation
A lexicon is an essential component in a generation system but few efforts have been made to build a rich, large-scale lexicon and make it reusable for different generation applications. In this paper, we describe our work to build such a lexicon by combining multiple, heterogeneous linguistic resources which have been developed for other purposes. Novel transformation and integration of resour...
SimSem: Fast Approximate String Matching in Relation to Semantic Category Disambiguation
In this study we investigate the merits of fast approximate string matching to address challenges relating to spelling variants and to utilise large-scale lexical resources for semantic class disambiguation. We integrate string matching results into machine learning-based disambiguation through the use of a novel set of features that represent the distance of a given textual span to the closest...
UBY - A Large-Scale Unified Lexical-Semantic Resource Based on LMF
We present UBY, a large-scale lexical-semantic resource combining a wide range of information from expert-constructed and collaboratively constructed resources for English and German. It currently contains nine resources in two languages: English WordNet, Wiktionary, Wikipedia, FrameNet and VerbNet, German Wikipedia, Wiktionary and GermaNet, and multilingual OmegaWiki modeled according to the LM...